On Jun 2, 2:38 am, Henri <yeah_ri...@donteventry.com> wrote:
> How would one go about comparing 2 strings one of which may contain
> special entities (e.g. "cassé" and "cass&#233;")?

Unless there is some Google Groups server "optimization" here, I see
in the first case a word containing the character e with an acute
accent, and in the second case a word containing the numeric HTML
entity "&#233;". If so, these are two completely different issues.
JavaScript operates in Unicode, so internally it sees any string
literal as a Unicode sequence, no matter what the actual page encoding
is. If you need to sort and transform strings according to the current
locale, use the locale-specific string manipulation methods:
string1.localeCompare(string2)
and
toLocaleLowerCase()
toLocaleUpperCase()
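
A minimal sketch of the first case, assuming the script source is
decoded correctly by the browser. The variable and function names here
are only for illustration, and the exact sort order depends on the
locale of the machine running it:

```javascript
// Two spellings of the same Unicode string.
var plain = "cassé";          // é typed directly in the source
var escaped = "cass\u00E9";   // the same character as a Unicode escape

// Internally these are the same sequence, so plain comparison works.
var identical = (plain === escaped);

// Locale-aware comparison, usable as a sort callback.
function compareLocale(a, b) {
  return a.localeCompare(b);
}

var words = ["cote", "cassé", "casse"];
var sorted = words.slice().sort(compareLocale); // order is locale-dependent

// Locale-aware case conversion.
var upper = plain.toLocaleUpperCase();
```
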
In the second case (with the HTML entity) it all depends on where you
are retrieving this string from. If you are getting it from the content
of a loaded page, then by the time you retrieve it the entities have
already been parsed, so for JavaScript it is the same Unicode string as
in the first case and you don't need to bother with any extra
transformation.
If it is a string literal "cass&#233;" then obviously for JavaScript
it is just the character sequence "c-a-s-s-&-#-2-3-3-;" and it has
nothing to do with "cassé". In this case either use a RegExp to replace
the entities using a custom lookup table, or insert the string into a
(hidden) HTML element and read back the parsed value.
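
Both options could look roughly like the sketch below. The names
NAMED_ENTITIES, decodeEntities and decodeViaElement are made up for
this example, the table is deliberately tiny, and the element variant
uses a detached textarea rather than a hidden element in the page,
which is my own assumption about the simplest way to do it:

```javascript
// Hypothetical lookup table for named entities; extend it as needed.
var NAMED_ENTITIES = {
  amp: "&", lt: "<", gt: ">", quot: "\"", eacute: "\u00E9"
};

// Option 1: RegExp replacement of numeric and (known) named entities.
function decodeEntities(text) {
  var pattern = /&(#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);/g;
  return text.replace(pattern, function (match, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
      var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // String.fromCharCode only covers code points up to 0xFFFF.
      return String.fromCharCode(code);
    }
    // Unknown names are left untouched.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

// Option 2: let the browser's HTML parser do the work (browser only).
function decodeViaElement(text) {
  var el = document.createElement("textarea");
  el.innerHTML = text;   // entities are parsed here
  return el.value;       // read back the parsed value
}

// After decoding, the two strings from the question compare as equal.
var matches = decodeEntities("cass&#233;") === "cass\u00E9";
```

The RegExp version works anywhere but only knows the entities you put
in the table; the element version handles whatever the browser does,
at the price of needing a document.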